Goto

Collaborating Authors

 out-of-distribution example





Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty

Dan Hendrycks, Mantas Mazeika, Saurav Kadavath, Dawn Song

Neural Information Processing Systems

Self-supervised learning holds great promise for improving representations when labeled data are scarce. In semi-supervised learning, recent self-supervision methods are state-of-the-art [Gidaris et al., 2018, Dosovitskiy et al., 2016, Zhai et al., 2019], and self-supervision is essential in video tasks where annotation is costly [V ondrick et al., 2016, 2018].



Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell

Neural Information Processing Systems

Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state-of-the-art for estimating predictive uncertainty; however these require significant modifications to the training procedure and are computationally expensive compared to standard (non-Bayesian) NNs. We propose an alternative to Bayesian NNs that is simple to implement, readily parallelizable, requires very little hyperparameter tuning, and yields high quality predictive uncertainty estimates. Through a series of experiments on classification and regression benchmarks, we demonstrate that our method produces well-calibrated uncertainty estimates which are as good or better than approximate Bayesian NNs. To assess robustness to dataset shift, we evaluate the predictive uncertainty on test examples from known and unknown distributions, and show that our method is able to express higher uncertainty on out-of-distribution examples. We demonstrate the scalability of our method by evaluating predictive uncertainty estimates on ImageNet.


Understanding the Effect of Out-of-distribution Examples and Interactive Explanations on Human-AI Decision Making

Liu, Han, Lai, Vivian, Tan, Chenhao

arXiv.org Artificial Intelligence

Although AI holds promise for improving human decision making in societally critical domains, it remains an open question how human-AI teams can reliably outperform AI alone and human alone in challenging prediction tasks (also known as complementary performance). We explore two directions to understand the gaps in achieving complementary performance. First, we argue that the typical experimental setup limits the potential of human-AI teams. To account for lower AI performance out-of-distribution than in-distribution because of distribution shift, we design experiments with different distribution types and investigate human performance for both in-distribution and out-of-distribution examples. Second, we develop novel interfaces to support interactive explanations so that humans can actively engage with AI assistance. Using in-person user study and large-scale randomized experiments across three tasks, we demonstrate a clear difference between in-distribution and out-of-distribution, and observe mixed results for interactive explanations: while interactive explanations improve human perception of AI assistance's usefulness, they may magnify human biases and lead to limited performance improvement. Overall, our work points out critical challenges and future directions towards complementary performance.


Robust Deep Learning Ensemble against Deception

Wei, Wenqi, Liu, Ling

arXiv.org Machine Learning

Deep neural network (DNN) models are known to be vulnerable to maliciously crafted adversarial examples and to out-of-distribution inputs drawn sufficiently far away from the training data. How to protect a machine learning model against deception of both types of destructive inputs remains an open challenge. This paper presents XEnsemble, a diversity ensemble verification methodology for enhancing the adversarial robustness of DNN models against deception caused by either adversarial examples or out-of-distribution inputs. XEnsemble by design has three unique capabilities. First, XEnsemble builds diverse input denoising verifiers by leveraging different data cleaning techniques. Second, XEnsemble develops a disagreement-diversity ensemble learning methodology for guarding the output of the prediction model against deception. Third, XEnsemble provides a suite of algorithms to combine input verification and output verification to protect the DNN prediction models from both adversarial examples and out of distribution inputs. Evaluated using eleven popular adversarial attacks and two representative out-of-distribution datasets, we show that XEnsemble achieves a high defense success rate against adversarial examples and a high detection success rate against out-of-distribution data inputs, and outperforms existing representative defense methods with respect to robustness and defensibility.


Deep Residual Flow for Novelty Detection

Zisselman, Ev, Tamar, Aviv

arXiv.org Machine Learning

The effective application of neural networks in the real-world relies on proficiently detecting out-of-distribution examples. Contemporary methods seek to model the distribution of feature activations in the training data for adequately distinguishing abnormalities, and the state-of-the-art method uses Gaussian distribution models. In this work, we present a novel approach that improves upon the state-of-the-art by leveraging an expressive density model based on normalizing flows. We introduce the residual flow, a novel flow architecture that learns the residual distribution from a base Gaussian distribution. Our model is general, and can be applied to any data that is approximately Gaussian. For novelty detection in image datasets, our approach provides a principled improvement over the state-of-the-art. Specifically, we demonstrate the effectiveness of our method in ResNet and DenseNet architectures trained on various image datasets. For example, on a ResNet trained on CIFAR-100 and evaluated on detection of out-of-distribution samples from the ImageNet dataset, holding the true positive rate (TPR) at $95\%$, we improve the true negative rate (TNR) from $56.7\%$ (current state-of-the-art) to $77.5\%$ (ours).


Detecting Adversarial Examples and Other Misclassifications in Neural Networks by Introspection

Aigrain, Jonathan, Detyniecki, Marcin

arXiv.org Machine Learning

Despite having excellent performances for a wide variety of tasks, modern neural networks are unable to provide a reliable confidence value allowing to detect misclassifications. This limitation is at the heart of what is known as an adversarial example, where the network provides a wrong prediction associated with a strong confidence to a slightly modified image. Moreover, this overconfidence issue has also been observed for regular errors and out-of-distribution data. We tackle this problem by what we call introspection, i.e. using the information provided by the logits of an already pretrained neural network. We show that by training a simple 3-layers neural network on top of the logit activations, we are able to detect misclassifications at a competitive level.